Duncan Blythe

Syntax-Aware Language Modeling with Recurrent Neural Networks

Mar 02, 2018

Duncan Blythe, Alan Akbik, Roland Vollgraf

Abstract: Neural language models (LMs) are typically trained using only lexical features, such as surface forms of words. In this paper, we argue this deprives the LM of crucial syntactic signals that can be detected at high confidence using existing parsers. We present a simple but highly effective approach for training neural LMs using both lexical and syntactic information, and a novel approach for applying such LMs to unparsed text using sequential Monte Carlo sampling. In experiments on a range of corpora and corpus sizes, we show our approach consistently outperforms standard lexical LMs in character-level language modeling; for word-level modeling, on the other hand, our models are on a par with standard language models. These results indicate potential for expanding LMs beyond lexical surface features to higher-level NLP features for character-level models.

Regression for sets of polynomial equations

Mar 25, 2013

Franz Johannes Király, Paul von Bünau, Jan Saputra Müller, Duncan Blythe, Frank Meinecke, Klaus-Robert Müller

Abstract: We propose a method called ideal regression for approximating an arbitrary system of polynomial equations by a system of a particular type. Using techniques from approximate computational algebraic geometry, we show how we can solve ideal regression directly without resorting to numerical optimization. Ideal regression is useful whenever the solution to a learning problem can be described by a system of polynomial equations. As an example, we demonstrate how to formulate Stationary Subspace Analysis (SSA), a source separation problem, in terms of ideal regression, which also yields a consistent estimator for SSA. We then compare this estimator in simulations with previous optimization-based approaches for SSA.

* Journal of Machine Learning Research Workshop and Conference Proceedings Vol.22: Proceedings on the Fifteenth International Conference on Artificial Intelligence and Statistics, 22:628-637. 2012
* arXiv admin note: substantial text overlap with arXiv:1108.1483

Feature Extraction for Change-Point Detection using Stationary Subspace Analysis

Aug 11, 2011

Duncan Blythe, Paul von Bünau, Frank Meinecke, Klaus-Robert Müller

Abstract: Detecting changes in high-dimensional time series is difficult because it involves the comparison of probability densities that need to be estimated from finite samples. In this paper, we present the first feature extraction method tailored to change point detection, which is based on an extended version of Stationary Subspace Analysis. We reduce the dimensionality of the data to the most non-stationary directions, which are most informative for detecting state changes in the time series. In extensive simulations on synthetic data we show that the accuracy of three change point detection algorithms is significantly increased by a prior feature extraction step. These findings are confirmed in an application to industrial fault monitoring.

* 24 pages, 20 figures, journal preprint
